ABSTRACT

Speaking on the operating policies and procedures of
computer data bases containing information on students, the author
divides his remarks into three parts: content decisions, data base
security, and user access. He offers nine recommended practices that
should increase the data base's usefulness to the user community: (1)
the cost of developing and maintaining a computer data base must be
justified in terms of valid, ongoing research questions and a large
student population; (2) the data base should contain only accurate
information necessary for answering important well-designed
questions; (3) a formal, identifiable "body" must decide on the data
base's content and the student population to be included in the data
base; (4) identification numbers, rather than student names, should be
put on each record; (5) the coding of data must be compatible with
the data base structure and consistent across years; (6) accurate,
up-to-date documentation of the data base contents and coding scheme
must be maintained; (7) the number of persons who have direct access
to the data base must be severely limited; (8) several backup copies
of the data base must be maintained; and (9) a system of priorities
for performing requested analyses must be developed. (Author/IET)
The operating policies and procedures of a computerized student data-base
ultimately help determine the usefulness of that data-base. Careful
consideration must be given to many aspects of that data-base: Who decides
the variables to be included in the system? Which variables should they be?
How will that data be coded? Who will have access to the data-base for
analysis purposes? Who will have access for data entry and data deletion?
How will unauthorized use of the data-base be prevented? Which users will
have priority? Should there be a Steering Committee? Which data should
remain confidential and what means are available to insure that? Who pays for
the requested analyses?

Different approaches to these important concerns are being taken by
various institutions and can offer guidance to others interested in developing
or improving their own student data-base system. These concerns can be
addressed under the rubrics of Content Decisions, Coding Decisions, Data-Base
Security and User Access.

Content Decisions

It is important that the data collected and stored in the data-base be
relevant to the issue at hand. This can include demographic, attitudinal and
achievement information. Without some clear-cut guidelines to regulate
content, it becomes easy to fall into the trap of gathering and storing every
bit of information about the student population which is available. This
amount of data soon becomes unwieldy and expensive in terms of the time and
effort spent collecting, coding and storing the irrelevant, unusable
information. Typical data-base information would include the following:



DEMOGRAPHIC INFORMATION

    Student name (not recommended)
    Identification number
    Sex
    Date of birth
    Marital status
    Father's occupation code
    Undergraduate degree level
    Undergraduate school code
    Undergraduate major
    Summer employment codes
    Extracurricular activity codes
    County of residence
    Year of matriculation into the program

ACHIEVEMENT INFORMATION

    Undergraduate GPA's
    Admissions Test subscores
    Class rank
    Exam scores for each course/grades
    National/State certification test scores
    Date of graduation from the program
    Sequence of courses through the program

ATTITUDINAL INFORMATION

    Orientation Day expectations inventory responses
    Career attitude inventory responses
    Instructional ratings of courses/instructors
    Graduation-exit questionnaire responses
    Post-graduation questionnaire responses

These content decisions must, of course, be in agreement with all pertinent
federal and state legislation governing individual privacy and informed
consent.

Many computer data-bases, for the sake of confidentiality, do not contain
actual student names, only student identification numbers. The correspondence
between names and numbers is then kept in written form in a notebook located
at a separate, secure location. Two lists of name/number pairings should be
maintained: first sorted by number, then sorted by name.
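As a minimal sketch of the two cross-reference lists just described, the following Python fragment prints the same pairings sorted once by number and once by name. The names, numbers, and the function `cross_reference_lists` are hypothetical and serve only as illustration; in keeping with the advice above, the real pairings would be kept apart from the data-base itself.

```python
# Hypothetical name/number pairings, for illustration only.
pairings = {
    10003: "Baker, Ann",
    10001: "Smith, John",
    11005: "Adams, Lee",
}


def cross_reference_lists(pairings):
    """Return the pairings sorted by number, and again sorted by name."""
    by_number = sorted(pairings.items())
    by_name = sorted((name, number) for number, name in pairings.items())
    return by_number, by_name


by_number, by_name = cross_reference_lists(pairings)

# List 1: look up a name from an identification number.
for number, name in by_number:
    print(f"{number:>6}  {name}")

# List 2: look up an identification number from a name.
for name, number in by_name:
    print(f"{name:<20}{number:>6}")
```

Both lists come from the same pairings, so they cannot drift out of step with one another.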

Not only must the specific types of information to be stored be decided,
but also the range of the student population must be considered. Is the
population to be only the currently enrolled students or all students ever
enrolled in the program, even for a short period of time? If the program has
a long history, perhaps data coding should start with the current class, or
some specified earlier class.

These types of content decisions are of vital importance in determining
the scope and ultimate utility of the data-base. Maintaining the correct data
on the correct population for each study is maximally cost-effective. Going
back several times through written student folders to collect those additional
pieces of information not previously thought important is slow and costly.

Because these content decisions are so important, a clearly defined "body"
must have that authority. That "body" can range from the University President
to the data key-puncher. The figure below shows a hierarchy of possible
decision-makers. Most educational offices maintaining a data-base have some
sort of steering committee overseeing its operations and setting its policy and
objectives. One such medical education office has a Data-Base Steering Board
consisting of 3 faculty members, the Associate Dean, the Associate Registrar,
and the Director of the data-base office. Another medical education office has
no steering committee, leaving the office director and data-base manager solely
responsible for content decisions. The decision-making "body" should be as
broad as the user population (those people planning to use the data-base
information).



State Legislature

Board of Regents

University President

College Dean

Dean's Staff Committee
(Dean, Associate Deans, Department Heads, etc.)

Office Steering Committee
(i.e., Office of Medical Education Steering Board; may consist
of Faculty, Students, Department Heads, Administration, Office Staff)

Data-Base Steering Committee

Office Director

Data-Base Manager

Programmer/Analyst

Coder/Keypuncher



Coding Decisions

Once decisions have been made as to the specific data to be incorporated
into the computer data-base, its coding must be considered. The coding format
will be dependent upon the organizational structure of the data-base itself.
This structure can range from simply punched data cards stored in a card filer,
to cards read "as-is" onto magnetic tape, to highly complex data-base
organizations specific to the computer hardware in use at that institution.
Because the data-base manager and programmer/analyst are most familiar with the
data-base structure, they should decide on the appropriate coding format.



Whatever the data-base structure, there are two requirements common to
all. First, each record must have some identification field containing a unique
number associating that data record with a specific individual. Secondly, if an
individual has several records, these records must be numbered uniquely. That
is to say, each data record must have at least an identification number referring
to a specific student, and may, if necessary, have a record number.

(Student ID)    (Record No.)    (Specified Information)

   10001            01
   10001            02
   10003            01
   11005            01
   11005            02
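The two requirements above amount to saying that the pair (student ID, record number) must never repeat. A minimal sketch of such a check follows; the field names `student_id` and `record_no` and the function `duplicate_keys` are assumptions made for this illustration, not part of any particular data-base system.

```python
from collections import Counter


def duplicate_keys(records):
    """Return every (student_id, record_no) pair that occurs more than once."""
    counts = Counter((r["student_id"], r["record_no"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical records; the last one deliberately repeats an existing key.
records = [
    {"student_id": "10001", "record_no": "01"},
    {"student_id": "10001", "record_no": "02"},
    {"student_id": "10003", "record_no": "01"},
    {"student_id": "10001", "record_no": "02"},
]

print(duplicate_keys(records))
```

Running a check of this kind before new records are added catches a repeated key while it is still cheap to correct.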



Most data is coded and punched on 80-byte data cards although, for some
high-volume projects, data input is done directly through computer terminals.

Variables to consider when deciding about coding formats include: the type
of data, the frequency with which data will be collected on a regular basis,
and the intended use of the data. If student names are entered into the data
base, allow a sufficient number of columns for the whole name. As married
students take on hyphenated last names, this field-width requirement is
expanding (40 columns is sufficient).

If the same data will be coded on regular occasions (say on subsequent
classes of students) the watchword is CONSISTENCY. The same type of data
should be coded in the same columns across occasions. In many medical
education data-bases, this is not the case. Because new, revised questionnaires
are used with each graduating class, the same question and associated answers
are located at different spots on the questionnaire and associated punched card.
It becomes a masterful task indeed to combine several classes of data when the
relevant data is sometimes in column 6, or 8, or 64! Careful pre-planning of
data collection instruments and coding formats will prevent (or at least
minimize) this possibility.
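Where the columns have already drifted between classes, one way to combine the data is to record each class's layout explicitly and read every card through it. The sketch below assumes fixed-width card images held as strings; the class years, variable names, and column positions are hypothetical.

```python
# Hypothetical layouts: variable -> (first column, last column),
# counted from 1 and inclusive, as on a keypunch coding sheet.
LAYOUTS = {
    1975: {"student_id": (1, 5), "career_item_1": (6, 6)},
    1976: {"student_id": (1, 5), "career_item_1": (8, 8)},
}


def read_card(card, class_year):
    """Extract the named variables from one card image for a given class."""
    layout = LAYOUTS[class_year]
    return {name: card[first - 1:last] for name, (first, last) in layout.items()}


# Two 80-column card images holding the same item in different columns.
card_1975 = "100013".ljust(80)
card_1976 = "10003  4".ljust(80)

print(read_card(card_1975, 1975))
print(read_card(card_1976, 1976))
```

The layout table also serves as documentation of where each item sits for each class, which is exactly the information that is hard to recover later.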

In addition, it is imperative that accurate, up-dated documentation be
maintained of the data-base contents and the coding scheme used for each record.
Too much documentation is never a problem. Keep a file of all data collection
instruments, coding/keypunching instructions and a printed listing of the
data-base. Being unable to locate a given piece of data which is known to be on
the data-base is very frustrating; likewise, being unable to identify a piece of
data you do locate can be disconcerting.

Given the opportunity to help design the data collection instrument (i.e.,
questionnaire), provide thoughtful concern for the ease of coding or direct
keypunching. If the form is designed correctly, coding can be eliminated and
data can be directly keypunched, thus saving both expense and a possible source
of inaccurate data.



All data coded and keypunched should be verified for accuracy. It is much
easier to correct inaccurate data at time of entry rather than after it has
been included in the data base. For additional accuracy, have another person
compare each line of the coding sheets to the original data, and then compare
the punched cards with the coding sheets. Lastly, insure that all cards are
actually and correctly entered into the data-base itself, by listing out the
data-base contents for verification.
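The final verification step, comparing a listing of the data-base against the coding sheets, can be sketched as a line-by-line comparison. This is an illustration under the assumption that both are available as lists of text lines in the same order; the sample lines and the function `mismatched_lines` are hypothetical.

```python
def mismatched_lines(original, keyed):
    """Return (line_number, original_line, keyed_line) for each disagreement.

    A line present in one source but missing from the other is reported
    with None in the missing position.
    """
    problems = []
    for i in range(max(len(original), len(keyed))):
        a = original[i] if i < len(original) else None
        b = keyed[i] if i < len(keyed) else None
        if a != b:
            problems.append((i + 1, a, b))
    return problems


# Hypothetical coding-sheet lines and the corresponding data-base listing.
coding_sheet = ["1000101 3 2 4", "1000102 1 1 5", "1000301 2 2 2"]
listing = ["1000101 3 2 4", "1000102 1 7 5"]

for line_no, expected, found in mismatched_lines(coding_sheet, listing):
    print(f"line {line_no}: expected {expected!r}, found {found!r}")
```

The comparison reports both a mispunched value and a card that never reached the data-base, the two failures the paragraph above warns about.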



For those who may be receiving punched data cards from sources outside
your own office for input to the data-base, pre-consultation is a necessary but
not sufficient condition to insure compatibility between the incoming data cards
and the data base. However, the programmer/analyst can "easily" knock out a
small reformatting/recoding computer program to handle the non-standard format!!

Data-Base Security

In order for the information contained in the data-base to remain
confidential and under your control, it must be "secured". This can be handled
through front-end password requirements and limited user access to the
data-base. This limited access strategy will be discussed in the next section of
this paper.

Typically, a university computer center requires input of a secret billing
number and password to directly access the computer system. This is the most
general (and most easily circumvented) security layer. Finer layers include
additional passwords to access the data-base, to read portions of the data-base
file, and finally to write or delete records. It seems obvious that these
passwords should be known to only selected personnel with a true "need to know".
These passwords should be changed periodically to decrease the chance of a
"leak". However, be sure the passwords are recorded for reference in the event
of faulty memory or the death of the only person who knows them.

For those ultra-security-conscious persons, some computer systems have the
capacity to scramble the data to make it "non-sense" to the casual onlooker.
Again, be able to unscramble it for analysis purposes.

Maintain several back-up copies of your data-base. It is typical to have
two current versions on magnetic tape stored at the computer center, perhaps
an older version stored there also, and at least one current version stored at
another secure location. There have been fires at computer centers with
resultant losses of computer tapes! If punched cards are used to enter data into
your data-base, these can serve as a suitable backup source in most
circumstances, but should be stored elsewhere, preferably off-campus.

Another aspect of data-base security, often overlooked, is the
trustworthiness of the data coder and keypuncher. The data coder (and sometimes
the keypuncher) will be handling confidential, very identifiable information.
They must be warned of its confidential nature. Indeed, if you farm out your
keypunching, make sure the data are anonymous, identified by coded ID only.

User Access

Closely related to data-base security is the need to limit user access.
An important distinction must be made between the handlers of the data-base
itself, and the users of data-base information. Only the data-base manager and
perhaps the programmer/analyst should have direct access to the data-base.
Other persons should have to go through these two people to have their requests
for analyses fulfilled. The data-base staff will retrieve the appropriate data,
perform the required analyses and then return the printed results or typed
reports to the person requesting the analysis.

Because of the sensitive nature of some of the data, potential users of
the data-base information should be screened, and such requests should be
prioritized. The same "body" that decided on content often sets these priorities
and limitations.

Requests can come from several routes, but all should result in a printed
request form. This form should be initialed to indicate (1) approval for
analysis, (2) completion of analysis, (3) when the results are returned to the
requestor, and (4) eventually filed.
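The four initialing points on the request form can be pictured as a small record that is signed off one step at a time. The sketch below is illustrative only: the class `AnalysisRequest`, its field names, and the rule that steps must be initialed in order are assumptions, not a description of any office's actual form.

```python
from dataclasses import dataclass, field

# The four points at which the form is initialed, in the order given above.
STEPS = ("approved", "completed", "returned", "filed")


@dataclass
class AnalysisRequest:
    requestor: str
    description: str
    initials: dict = field(default_factory=dict)

    def next_step(self):
        """Name of the next step awaiting initials, or None when all are done."""
        done = len(self.initials)
        return STEPS[done] if done < len(STEPS) else None

    def initial(self, step, who):
        """Record who initialed a step; steps must be taken in order (assumed)."""
        if step != self.next_step():
            raise ValueError(f"expected {self.next_step()!r}, got {step!r}")
        self.initials[step] = who


# Hypothetical request moving through the first two steps.
request = AnalysisRequest("Dept. of Surgery", "Class rank by admissions subscore")
request.initial("approved", "JD")
request.initial("completed", "RK")
print(request.next_step())
```

A record of this shape shows at a glance which requests are still awaiting approval and which have been completed but not yet returned.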

Cost recovery: who will pay for these analyses? At some offices, legitimate
requests are handled "free", the total data-base office budget coming solely from
the college. In other offices, departments and funded projects pay all associated
requested analysis costs, incrementing the budget of the data-base office. This
policy will probably be decided at the Steering Committee level or a higher
campus level.



Conclusions



Although no two computer data-bases will have exactly the same environment,
purposes, format or content, it remains possible to list several recommended
practices which will increase their usefulness to the user community.

1. The cost of developing and maintaining a computer data-base must be
   justified in terms of valid on-going research questions and a large
   student population.

2. The data-base should contain only accurate information necessary for
   answering important well-designed questions.

3. An identifiable "body" must decide on the data-base content and the
   student population to be included in the data-base.

4. Do not include student names in the data-base; do use identification
   numbers on each record.

5. The coding of data must be compatible with the data-base structure and
   consistent across years.

6. Maintain accurate, up-to-date documentation of the data-base contents
   and coding scheme.

7. Severely limit the number of persons who have direct access to the
   data-base.

8. Maintain several backup copies of your data-base.

9. Develop a system of priorities for performing requested analyses.